This report accompanies The Century Foundation’s report “Exact Title TBD: Student Debt and Race in California,” which examines how student debt puts outsized financial burdens on Black and Latino families in California, especially for graduate borrowers and parents.
While California lawmakers rightly draw on national research on student debt and race, the state-specific analyses we conduct in this research can help guide state policy that accounts for the distinct patterns of borrowing in California. In particular, these analyses draw attention to the effects of uncapped Parent PLUS and Grad PLUS loans on California’s families.
In this report, we walk through the details of the data that underlie that report and explain in greater depth what we do and do not know about student debt in California. Our analysis relies on four sources:
We also draw from the Integrated Postsecondary Education Data System (IPEDS) and the American Community Survey (ACS) to lesser extents.
The structure of this data-focused report mirrors that of the policy-focused report: the charts and tables in the policy report draw from our four primary data sources in the same order that they are presented here. The first figures in the report draw from the FSA Data Center, and they are followed by figures that draw from NPSAS, and so on. This report is not a static PDF: you can toggle between different versions of charts and hover over bar charts and scatterplots to reveal data points.
This document is more descriptive than prescriptive. Even data queries that did not show striking contrasts between groups are included for full transparency, so long as they are relevant to the examination of student loan debt’s burden on California borrowers.
The code used to produce this document and its charts and tables can be found at this GitHub repository. All data sets used in this report are publicly available, courtesy of the U.S. Department of Education, the U.S. Federal Reserve, and the U.S. Census Bureau.
For any questions, please email granville@tcf.org.
The FSA Data Center is a repository of data and statistics on federal student aid, including spreadsheets on student loans reported directly from the National Student Loan Data System. For this analysis we use two files from the FSA Data Center:
For “per capita” measures of student debt and borrowing below, the population for comparison is the estimated total of all California adults aged 18 to 50, using American Community Survey data reflecting calendar year 2021, available here.
| Measure | 50-state median | California value | California rank |
|---|---|---|---|
| Federal student loan debt per capita | $10,494 | $7,973 | 6 |
| Federal student loan borrowers per capita | 0.298 | 0.215 | 4 |
| Average federal student loan balance | $34,623 | $37,084 | 40 |
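The three measures in the table are linked: dividing debt per capita by borrowers per capita cancels the shared population denominator and leaves the average balance per borrower. The following is a minimal Python sketch of that consistency check, using only California's rounded values from the table. The variable names are ours, and this is not the code used to build the table.

```python
# California values as published in the table above (already rounded).
debt_per_capita = 7_973        # dollars of federal student debt per adult aged 18 to 50
borrowers_per_capita = 0.215   # federal borrowers per adult aged 18 to 50

# The population denominator cancels out:
# (debt / population) / (borrowers / population) = debt / borrowers
implied_average_balance = debt_per_capita / borrowers_per_capita

print(f"Implied average balance: ${implied_average_balance:,.0f}")
```

Because the inputs are rounded, the result should be read as approximately matching the reported California average balance of $37,084 rather than reproducing it exactly.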
California has more outstanding federal student loan debt and more federal student loan borrowers than any other state, each amounting to around 9 percent of the national total.
| Measure | U.S. total | California value | California share |
|---|---|---|---|
| Total outstanding federal student loan debt | $1,505,800,000,000 | $141,800,000,000 | 9.4% |
| Total federal student loan borrowers | 41,874,000 | 3,824,000 | 9.1% |
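The California shares in the table follow directly from the totals. As a small Python sketch (the helper `share` is a hypothetical name introduced here for illustration):

```python
# Totals as published in the table above.
us_debt = 1_505_800_000_000
ca_debt = 141_800_000_000
us_borrowers = 41_874_000
ca_borrowers = 3_824_000


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


ca_debt_share = share(ca_debt, us_debt)
ca_borrower_share = share(ca_borrowers, us_borrowers)

print(f"California share of debt: {ca_debt_share:.1f}%")
print(f"California share of borrowers: {ca_borrower_share:.1f}%")
```

Rounded to one decimal place, these reproduce the 9.4 percent and 9.1 percent shown in the table.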
Now we turn to the quarterly data on disbursements. To avoid any impact of the COVID-19 pandemic on the data, we examine data from the 2018-19 award year.
| State | Subsidized | Unsubsidized undergraduate | Unsubsidized graduate | Parent PLUS | Grad PLUS |
|---|---|---|---|---|---|
| CA | 19.0% | 17.0% | 31.8% | 12.8% | 19.4% |
| U.S. | 21.7% | 22.7% | 29.8% | 14.0% | 11.8% |
Source: FSA Data Center.
Year(s) of analysis: Reflects loans distributed in 2018-19. Filtered for four-year institutions.
The breakdown of loans distributed by California institutions is skewed more towards Grad PLUS than the breakdown of loans distributed by institutions nationwide. Unsubsidized graduate loans also take up a larger slice of the pie in California than in the nation overall.
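To make the comparison concrete, the sketch below computes the percentage-point gap between California and the U.S. for each loan type, plus the combined graduate share (unsubsidized graduate plus Grad PLUS). It uses only the values in the table above. The dictionary layout and variable names are ours, chosen for illustration.

```python
# Breakdown of 2018-19 loans by type, in percent, as published in the table above.
loan_mix = {
    "CA": {
        "Subsidized": 19.0,
        "Unsubsidized undergraduate": 17.0,
        "Unsubsidized graduate": 31.8,
        "Parent PLUS": 12.8,
        "Grad PLUS": 19.4,
    },
    "U.S.": {
        "Subsidized": 21.7,
        "Unsubsidized undergraduate": 22.7,
        "Unsubsidized graduate": 29.8,
        "Parent PLUS": 14.0,
        "Grad PLUS": 11.8,
    },
}

# Percentage-point gap for each loan type (California minus U.S.).
gaps = {
    loan_type: round(loan_mix["CA"][loan_type] - loan_mix["U.S."][loan_type], 1)
    for loan_type in loan_mix["CA"]
}

# Combined share of the two graduate loan types in each row.
graduate_types = ("Unsubsidized graduate", "Grad PLUS")
graduate_share = {
    place: round(sum(mix[t] for t in graduate_types), 1)
    for place, mix in loan_mix.items()
}

for loan_type, gap in gaps.items():
    print(f"{loan_type}: {gap:+.1f} percentage points")
print(graduate_share)
```

Grad PLUS shows the largest gap, at 7.6 percentage points, and the two graduate loan types together make up 51.2 percent of the California breakdown versus 41.6 percent nationwide.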
The statistics in the previous section have all been based on institution-level data. For the most robust information on the relationship between student loan borrowing and race in California, we need student-level data. In the absence of a national student-level data set, we use survey data from the National Postsecondary Student Aid Study (NPSAS).
NPSAS is the largest federal survey of U.S. college students with a primary focus on financial aid. The study has generally been conducted every four years, most recently in 2016 (NPSAS:16), with separate data sets examining undergraduate and graduate students.
NPSAS data sets have not traditionally been used for state-level analyses, although the sample size for California students can be large enough to produce reliable estimates depending on the query. This past year, a new edition of NPSAS that was designed for state-representative samples was released. Known as NPSAS-AC (Administrative Collection), it draws from student records housed by colleges and the U.S. Department of Education for a sample of 325,000 undergraduates. Representative samples for public 4-year systems are available in NPSAS-AC for 45 states, and representative samples for public 2-year systems are available for 36 states. Thirty states have representative samples for undergraduate students overall. More information on NPSAS-AC can be found here.
In this section we rely primarily on NPSAS-AC, which reflects the 2017-18 year. For analysis of graduate students, we use NPSAS:16 and filter for in-state students in California. (The in-state condition is required for this query.)
These data are accessed using the National Center for Education Statistics’ (NCES) Datalab tool. Every query has a unique table retrieval number that any user can enter to rerun the query in Datalab.
The figure below compares average undergraduate borrowing by racial group at California public four-years with that at U.S. public four-years. Across all groups, the average federal loan total is lower in California than nationwide. However, California resembles the U.S. in that Black undergraduates and their families borrow more than their peers. California differs from the U.S. in that Latino/a undergraduates in the state borrow more than white undergraduates, though this is only the case for direct loans to students and not for Parent PLUS.
Source: NPSAS:18-AC. Table retrieval number: lounfy.
Year(s) of analysis: Reflects students enrolled in the 2017-18 academic year. Only reflects undergraduates at public 4-year institutions.
Note: Insufficient sample size for the Native American / Alaskan Native population. Zeros are counted in the averages, meaning the averages include students who took out no loans.
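Counting zeros matters for interpretation: an average across all students is pulled down by non-borrowers and will be lower than an average among borrowers only. The sketch below illustrates the difference with hypothetical loan amounts invented for this example. They are not NPSAS data.

```python
from statistics import mean

# Hypothetical annual loan amounts for five students; two borrowed nothing.
loans = [0, 0, 3_000, 5_500, 6_500]

# Zeros included, which is the convention used in the figure above.
average_all_students = mean(loans)

# Zeros excluded: the average among students who actually borrowed.
average_borrowers_only = mean(amount for amount in loans if amount > 0)

print(average_all_students, average_borrowers_only)
```

In this toy example the all-student average is $3,000, while the borrower-only average is $5,000.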
Source: NPSAS:16. Table retrieval number: psiqll. Data can be accessed at NCES Datalab.
Year(s) of analysis: Reflects students enrolled in the 2015-16 academic year. Only reflects public 4-year institutions.
Notes: Private loans are also used, but the average amounts are so small that Datalab considers the estimates unreliable. Due to limitations of NPSAS:16, this figure reflects only in-state students.
Across all groups, graduate students in California borrow more than graduate students nationwide. Black students and those of two or more races show the highest average loans, upwards of $20,000 per year. Strikingly, Black graduate students in California borrow, on average, roughly two-thirds more than both Black graduate students nationwide and white graduate students in California.
Among California students, the sample is not sufficient for a breakdown by award level. For context, here is a breakdown of gaps in average loans across all graduate program levels, reflecting the national sample. This demonstrates how professional doctorates show an extreme version of the trends just described. Although professional doctoral students in California constitute a very small subsample in NPSAS, we can explore this further using the College Scorecard later on.
NPSAS allows us to disaggregate by ethnicity, to a limited extent, for two racial groups (Hispanic and Asian). In California, average federal loans among Filipino undergraduates are higher than those of other Asian groups, matching a trend seen nationwide. Among Hispanic ethnicities in California, average loans are greatest among those of Puerto Rican descent.
Source: NPSAS, table retrieval number ddgpwg (U.S. all), jejvld (California in-state), tpsawz (U.S. in-state).
Note: Filtered for 4-year colleges and U.S. citizenship. Limited to in-state, undergraduate students. “Total federal loans” includes Parent PLUS. No other racial groups besides Hispanic and Asian have breakouts by ethnicity in NPSAS.
This section draws on the Survey of Household Economics and Decisionmaking (SHED), conducted by the Federal Reserve. SHED contains many variables; we selected the ones that seemed most relevant to the question of the financial burden of student loans. The relationship between student loans and these variables may be chicken-and-egg: we cannot say for sure which causes the other. The sample size is also a limitation. It would be useful to filter by institution level and control, but doing so would cut the sample size down very far. Small differences of one or two percentage points may be attributable to random noise that can arise from sampling. Respondents are heads of households, meaning that the population represented is that of U.S. adults. (Children are not included.) A good example of the kind of question SHED answers that other data sets cannot: do borrowers think their education was worth it?
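To give a rough sense of how much sampling noise to expect, the sketch below computes an approximate 95 percent margin of error for a survey proportion at several sample sizes. It assumes a simple random sample and ignores survey weights and design effects, so actual SHED margins may be wider. The sample sizes are hypothetical and the function name `margin_of_error` is ours.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95 percent margin of error, in percentage points,
    for a proportion p estimated from a simple random sample of size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


# Hypothetical subgroup sample sizes; p = 0.5 gives the widest margin.
for n in (250, 1_000, 4_000):
    print(f"n = {n}: +/- {margin_of_error(0.5, n):.1f} percentage points")
```

Under these assumptions, a subgroup of 1,000 respondents carries a margin of about 3 percentage points, which is why gaps of one or two points between groups should not be over-interpreted.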
Here we tee up the findings.
Survey responses collected in 2020 and 2021 are not included due to the federal student loan payment pause in place that started in March 2020 and continued through all of 2021.
Figure 11 includes all respondents, not just those who have student debt for their own education. “Figure 11: Debt for spouse’s or partner’s education” does not include survey respondents without a spouse or partner. “Figure 11: Debt for child or grandchild’s education” does not include survey respondents without children or grandchildren.
Survey years 2020 and 2021 not included in “Figure 22: Ability to pay student loan bill” due to the federal student loan repayment pause.
Limited to those who have student loans. Survey years 2020 and 2021 not included.
Add a lot of caveats before this next section.
It’s weird that the “Cover expenses with savings” bars are lower than the “Cover expenses by any means” bars.
Responses are not mutually exclusive. For example, a respondent could say that they would sell belongings and use a payday loan. Because of this, groups’ bars cannot be stacked on top of each other in one chart and are instead presented here in a series of charts.
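The sketch below shows why stacking would mislead: when respondents can choose several options, the option shares can add up to more than 100 percent. The responses are hypothetical, invented for this example, and "Borrow from family" is an illustrative option name rather than a SHED variable.

```python
from collections import Counter

# Hypothetical multi-select answers: each respondent may pick several options.
responses = [
    {"Sell belongings", "Payday loan"},
    {"Borrow from family"},
    {"Sell belongings"},
    {"Borrow from family", "Sell belongings"},
]

# Count how many respondents chose each option.
counts = Counter(option for answer in responses for option in answer)

# Share of respondents choosing each option, in percent.
shares = {option: 100 * count / len(responses) for option, count in counts.items()}

total = sum(shares.values())
print(shares)
print(f"Sum of shares: {total:.0f}%")
```

Here the shares sum to 150 percent across four respondents, so a stacked bar would overstate the group, while separate charts per option stay interpretable.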